Weighted-$\ell_1$ minimization with multiple weighting sets



Hassan Mansour^{a,b} and Özgür Yılmaz

^{a} Mathematics Department, University of British Columbia, Vancouver - BC, Canada;
^{b} Computer Science Department, University of British Columbia, Vancouver - BC, Canada

ABSTRACT 

In this paper, we study the support recovery conditions of weighted $\ell_1$ minimization for signal reconstruction
from compressed sensing measurements when multiple support estimate sets with different accuracy are available.
We identify a class of signals for which the recovered vector from $\ell_1$ minimization provides an accurate support
estimate. We then derive stability and robustness guarantees for the weighted $\ell_1$ minimization problem with
more than one support estimate. We show that applying a smaller weight to support estimates that enjoy higher
accuracy improves the recovery conditions compared with the case of a single support estimate and the case with
standard, i.e., non-weighted, $\ell_1$ minimization. Our theoretical results are supported by numerical simulations on
synthetic signals and real audio signals.

1. INTRODUCTION

A wide range of signal processing applications rely on the ability to realize a signal from linear and sometimes 
noisy measurements. These applications include the acquisition and storage of audio, natural and seismic images, 
and video, which all admit sparse or approximately sparse representations in appropriate transform domains. 

Compressed sensing has emerged as an effective paradigm for the acquisition of sparse signals from significantly 
fewer linear measurements than their ambient dimension. Consider an arbitrary signal $x \in \mathbb{R}^N$ and let $y \in \mathbb{R}^n$
be a set of measurements given by

$$y = Ax + e,$$

where $A$ is a known $n \times N$ measurement matrix, and $e$ denotes additive noise that satisfies $\|e\|_2 \le \epsilon$ for some
known $\epsilon \ge 0$. Compressed sensing theory states that it is possible to recover $x$ from $y$ (given $A$) even when
$n \ll N$, i.e., using very few measurements.

When $x$ is strictly sparse, i.e., when there are only $k < n$ nonzero entries in $x$, and when $e = 0$, one may
recover an estimate $x^*$ of the signal $x$ as the solution of the constrained $\ell_0$ minimization problem

$$\text{minimize} \ \|u\|_0 \quad \text{subject to} \quad Au = y. \tag{1}$$

In fact, using (1), the recovery is exact when $n > 2k$ and $A$ is in general position. However, $\ell_0$ minimization
is a combinatorial problem and quickly becomes intractable as the dimensions increase. Instead, the convex
relaxation

$$\text{minimize} \ \|u\|_1 \quad \text{subject to} \quad \|Au - y\|_2 \le \epsilon \tag{2}$$

can be used to recover the estimate $x^*$. Candès, Romberg, and Tao and Donoho show that if $n \gtrsim k\log(N/k)$,
then $\ell_1$ minimization (2) can stably and robustly recover $x$ from inaccurate and what appears to be "incomplete"
measurements $y = Ax + e$, where, as before, $A$ is an appropriately chosen $n \times N$ measurement matrix and $\|e\|_2 \le \epsilon$.
Contrary to $\ell_0$ minimization, (2), which is a convex program, can be solved efficiently. Consequently, it is possible
to recover a stable and robust approximation of $x$ by solving (2) instead of (1) at the cost of increasing the number
of measurements taken.

Several works in the literature have proposed alternate algorithms that attempt to bridge the gap between $\ell_0$
and $\ell_1$ minimization. For example, the recovery from compressed sensing measurements using $\ell_p$ minimization
with $0 < p < 1$ has been shown to be stable and robust under weaker conditions than those of $\ell_1$ minimization.
However, the problem is non-convex and even though various simple and efficient algorithms were proposed and
observed to perform well empirically, so far only local convergence can be proved. Another approach for
improving the recovery performance of $\ell_1$ minimization is to incorporate prior knowledge regarding the support
of the signal to be recovered. One way to accomplish this is to replace $\ell_1$ minimization in (2) with weighted $\ell_1$
minimization

$$\text{minimize} \ \|u\|_{1,w} \quad \text{subject to} \quad \|Au - y\|_2 \le \epsilon, \tag{3}$$

where $w \in [0,1]^N$ and $\|u\|_{1,w} := \sum_i w_i |u_i|$ is the weighted $\ell_1$ norm. This approach has been studied by several
groups and, most recently, by the authors, together with Saab and Friedlander [15]. In that work, we proved
that, conditioned on the accuracy and relative size of the support estimate, weighted $\ell_1$ minimization is stable
and robust under weaker conditions than those of standard $\ell_1$ minimization.
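To make the objective in (3) concrete, the following minimal sketch evaluates the weighted $\ell_1$ norm $\|u\|_{1,w} = \sum_i w_i|u_i|$ for plain Python sequences. The function name `weighted_l1_norm` and the example vectors are illustrative assumptions and do not come from the paper.

```python
def weighted_l1_norm(u, w):
    """Return sum_i w_i * |u_i| for equal-length sequences u and w."""
    if len(u) != len(w):
        raise ValueError("u and w must have the same length")
    if any(wi < 0.0 or wi > 1.0 for wi in w):
        raise ValueError("weights must lie in [0, 1]")
    return sum(wi * abs(ui) for ui, wi in zip(u, w))


# Hypothetical example: the first two entries are believed to be on the support
# and therefore penalized less (weight 0.5) than the remaining entries (weight 1).
u = [3.0, -2.0, 0.0, 1.0]
w = [0.5, 0.5, 1.0, 1.0]
print(weighted_l1_norm(u, w))
```

Setting all weights to one recovers the ordinary $\ell_1$ norm, which is the objective of (2).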

The works mentioned above mainly focus on a "two-weight" scenario: for $x \in \mathbb{R}^N$, one is given a partition
of $\{1, \dots, N\}$ into two sets, say $\widetilde{T}$ and $\widetilde{T}^c$. Here $\widetilde{T}$ denotes the estimated support of the entries of $x$ that are
largest in magnitude. In this paper, we consider the more general case and study recovery conditions of weighted
$\ell_1$ minimization when multiple support estimates with different accuracies are available. We first give a brief
overview of compressed sensing and review our previous result on weighted $\ell_1$ minimization in Section 2. In
Section 3, we prove that for a certain class of signals it is possible to estimate the support of the best $k$-term
approximation using standard $\ell_1$ minimization. We then derive stability and robustness guarantees for weighted
$\ell_1$ minimization which generalize our previous work to the case of two or more weighting sets. Finally, we
present numerical experiments in Section 4 that verify our theoretical results.

2. COMPRESSED SENSING WITH PARTIAL SUPPORT INFORMATION 

Consider an arbitrary signal $x \in \mathbb{R}^N$ and let $x_k$ be its best $k$-term approximation, given by keeping the $k$
largest-in-magnitude components of $x$ and setting the remaining components to zero. Let $T_0 = \mathrm{supp}(x_k)$, where
$T_0 \subseteq \{1, \dots, N\}$ and $|T_0| \le k$. We wish to reconstruct the signal $x$ from $y = Ax + e$, where $A$ is a known $n \times N$
measurement matrix with $n \ll N$, and $e$ denotes the (unknown) measurement error that satisfies $\|e\|_2 \le \epsilon$ for
some known margin $\epsilon \ge 0$. Also let the set $\widetilde{T} \subseteq \{1, \dots, N\}$ be an estimate of the support $T_0$ of $x_k$.
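As a small illustration of the objects just defined, the sketch below computes the best $k$-term approximation $x_k$ and its support $T_0$. The helper name `best_k_term` is an assumption made for this example, indices are zero-based (unlike the one-based convention of the text), and ties in magnitude are broken in favor of the lower index.

```python
def best_k_term(x, k):
    """Return (x_k, T0): x with all but its k largest-magnitude entries set to
    zero, together with the index set T0 of the retained entries."""
    # sorted() is stable, so entries of equal magnitude keep their index order.
    order = sorted(range(len(x)), key=lambda i: -abs(x[i]))
    support = set(order[:k])
    x_k = [x[i] if i in support else 0.0 for i in range(len(x))]
    return x_k, support


x = [0.1, -5.0, 2.0, 0.0, 3.0]
x_k, T0 = best_k_term(x, 2)
print(x_k, sorted(T0))
```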

2.1 Compressed sensing overview 

It was shown in [2] that $x$ can be stably and robustly recovered from the measurements $y$ by solving the
optimization problem (2) if the measurement matrix $A$ has the restricted isometry property (RIP).

Definition 1. The restricted isometry constant $\delta_k$ of a matrix $A$ is the smallest number such that for all
$k$-sparse vectors $u$,

$$(1 - \delta_k)\|u\|_2^2 \le \|Au\|_2^2 \le (1 + \delta_k)\|u\|_2^2. \tag{4}$$

The following theorem uses the RIP to provide conditions and bounds for stable and robust recovery of $x$ by
solving (2).

Theorem 2.1 (Candès, Romberg, Tao). Suppose that $x$ is an arbitrary vector in $\mathbb{R}^N$, and let $x_k$ be the
best $k$-term approximation of $x$. Suppose that there exists an $a \in \frac{1}{k}\mathbb{Z}$ with $a > 1$ and

$$\delta_{ak} + a\,\delta_{(1+a)k} < a - 1. \tag{5}$$

Then the solution $x^*$ to (2) obeys

$$\|x^* - x\|_2 \le C_0\,\epsilon + C_1 k^{-1/2}\|x - x_k\|_1. \tag{6}$$



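Remark 1. The constants in Theorem 2.1 are explicitly given by

$$C_0 = \frac{2\left(1 + a^{-1/2}\right)}{\sqrt{1 - \delta_{(a+1)k}} - a^{-1/2}\sqrt{1 + \delta_{ak}}}, \qquad C_1 = \frac{2a^{-1/2}\left(\sqrt{1 - \delta_{(a+1)k}} + \sqrt{1 + \delta_{ak}}\right)}{\sqrt{1 - \delta_{(a+1)k}} - a^{-1/2}\sqrt{1 + \delta_{ak}}}. \tag{7}$$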



Theorem 2.1 shows that the constrained $\ell_1$ minimization problem in (2) recovers an approximation to $x$ with
an error that scales well with noise and the "compressibility" of $x$, provided (5) is satisfied. Moreover, if $x$ is
sufficiently sparse (i.e., $x = x_k$), and if the measurement process is noise-free, then Theorem 2.1 guarantees exact
recovery of $x$ from $y$. At this point, we note that a slightly stronger sufficient condition compared to (5), one that
is easier to compare with the conditions we obtain in the next section, is given by

$$\delta_{(a+1)k} < \frac{a - 1}{a + 1}. \tag{8}$$
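The quantities in Theorem 2.1 are straightforward to evaluate once values for the RIP constants are assumed. The sketch below computes the constants of (7) and the bound in (8); the function names and the numerical RIP constants are hypothetical, since RIP constants of a given matrix are in general not available in closed form.

```python
import math


def l1_recovery_constants(a, delta_ak, delta_a1k):
    """Return (C0, C1) from (7), or None when the common denominator is not
    positive, which is equivalent to condition (5) failing."""
    r = math.sqrt(1.0 - delta_a1k)
    s = math.sqrt(1.0 + delta_ak)
    denom = r - s / math.sqrt(a)
    if denom <= 0.0:
        return None
    c0 = 2.0 * (1.0 + 1.0 / math.sqrt(a)) / denom
    c1 = 2.0 * (r + s) / (math.sqrt(a) * denom)
    return c0, c1


def sufficient_rip_bound(a):
    """Right-hand side of the simplified sufficient condition (8)."""
    return (a - 1.0) / (a + 1.0)


# Hypothetical RIP constants, chosen only to exercise both branches.
print(sufficient_rip_bound(3.0))
print(l1_recovery_constants(3.0, 0.4, 0.4))  # (5) holds: constants are returned
print(l1_recovery_constants(3.0, 0.6, 0.6))  # (5) fails: None
```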

2.2 Weighted $\ell_1$ minimization

The $\ell_1$ minimization problem (2) does not incorporate any prior information about the support of $x$. However,
in many applications it may be possible to draw an estimate of the support of the signal or an estimate of the
indices of its largest coefficients.

In our previous work [15], we considered the case where we are given a support estimate $\widetilde{T} \subseteq \{1, \dots, N\}$ for
$x$ with a certain accuracy. We investigated the performance of weighted $\ell_1$ minimization, as described in (3),
where the weights are assigned such that $w_j = \omega \in [0, 1]$ whenever $j \in \widetilde{T}$, and $w_j = 1$ otherwise. In particular,
we proved that if the (partial) support estimate is at least 50% accurate, then weighted $\ell_1$ minimization with
$\omega < 1$ outperforms standard $\ell_1$ minimization in terms of accuracy, stability, and robustness.

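Suppose that $\widetilde{T}$ has cardinality $|\widetilde{T}| = \rho k$, where $0 \le \rho \le N/k$ is the relative size of the support estimate $\widetilde{T}$.
Furthermore, define the accuracy of $\widetilde{T}$ via $\alpha := \frac{|\widetilde{T} \cap T_0|}{|\widetilde{T}|}$, i.e., $\alpha$ is the fraction of $\widetilde{T}$ inside $T_0$. As before, we wish to
recover an arbitrary vector $x \in \mathbb{R}^N$ from noisy compressive measurements $y = Ax + e$, where $e$ satisfies $\|e\|_2 \le \epsilon$.
To that end, we consider the weighted $\ell_1$ minimization problem with the following choice of weights:

$$\text{minimize} \ \|z\|_{1,w} \quad \text{subject to} \quad \|Az - y\|_2 \le \epsilon \quad \text{with} \quad w_i = \begin{cases} 1, & i \in \widetilde{T}^c, \\ \omega, & i \in \widetilde{T}. \end{cases} \tag{9}$$

Here, $0 \le \omega \le 1$ and $\|z\|_{1,w}$ is as defined in (3). Figure 1 illustrates the relationship between the support $T_0$,
support estimate $\widetilde{T}$ and the weight vector $w$.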
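The relative size $\rho$, the accuracy $\alpha$ and the weight vector of (9) follow directly from the index sets. The sketch below is an illustration only: the helper names are assumptions, and indices are zero-based.

```python
def support_estimate_stats(T_est, T0, k):
    """Return (rho, alpha), where |T_est| = rho * k and
    |T_est ∩ T0| = alpha * |T_est|."""
    T_est, T0 = set(T_est), set(T0)
    rho = len(T_est) / k
    alpha = len(T_est & T0) / len(T_est) if T_est else 0.0
    return rho, alpha


def two_set_weights(N, T_est, omega):
    """Weight vector of (9): omega on the support estimate, 1 elsewhere."""
    T_est = set(T_est)
    return [omega if i in T_est else 1.0 for i in range(N)]


T0 = {0, 1, 2, 3}        # hypothetical true support, k = 4
T_est = {2, 3, 4, 5}     # hypothetical estimate, half of which is correct
print(support_estimate_stats(T_est, T0, k=4))
print(two_set_weights(6, T_est, omega=0.3))
```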




Figure 1. Illustration of the signal $x$ and weight vector $w$ emphasizing the relationship between the sets $T_0$ and $\widetilde{T}$.



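Theorem 2.2 (FMSY [15]). Let $x$ be in $\mathbb{R}^N$ and let $x_k$ be its best $k$-term approximation, supported on $T_0$. Let
$\widetilde{T} \subseteq \{1, \dots, N\}$ be an arbitrary set and define $\rho$ and $\alpha$ as before such that $|\widetilde{T}| = \rho k$ and $|\widetilde{T} \cap T_0| = \alpha\rho k$. Suppose
that there exists an $a \in \frac{1}{k}\mathbb{Z}$, with $a \ge (1 - \alpha)\rho$, $a > 1$, and the measurement matrix $A$ has RIP with

$$\delta_{ak} + \frac{a}{\left(\omega + (1 - \omega)\sqrt{1 + \rho - 2\alpha\rho}\right)^2}\,\delta_{(a+1)k} < \frac{a}{\left(\omega + (1 - \omega)\sqrt{1 + \rho - 2\alpha\rho}\right)^2} - 1, \tag{10}$$

for some given $0 \le \omega \le 1$. Then the solution $x^*$ to (9) obeys

$$\|x^* - x\|_2 \le C_0'\,\epsilon + C_1' k^{-1/2}\left(\omega\|x - x_k\|_1 + (1 - \omega)\|x_{\widetilde{T}^c \cap T_0^c}\|_1\right), \tag{11}$$

where $C_0'$ and $C_1'$ are well-behaved constants that depend on the measurement matrix $A$, the weight $\omega$, and the
parameters $\alpha$ and $\rho$.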

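Remark 2. The constants $C_0'$ and $C_1'$ are explicitly given by the expressions

$$C_0' = \frac{2\left(1 + \frac{\omega + (1 - \omega)\sqrt{1 + \rho - 2\alpha\rho}}{\sqrt{a}}\right)}{\sqrt{1 - \delta_{(a+1)k}} - \frac{\omega + (1 - \omega)\sqrt{1 + \rho - 2\alpha\rho}}{\sqrt{a}}\sqrt{1 + \delta_{ak}}}, \qquad C_1' = \frac{2a^{-1/2}\left(\sqrt{1 - \delta_{(a+1)k}} + \sqrt{1 + \delta_{ak}}\right)}{\sqrt{1 - \delta_{(a+1)k}} - \frac{\omega + (1 - \omega)\sqrt{1 + \rho - 2\alpha\rho}}{\sqrt{a}}\sqrt{1 + \delta_{ak}}}. \tag{12}$$

Consequently, Theorem 2.2, with $\omega = 1$, reduces to the stable and robust recovery theorem of Candès, Romberg, and Tao, which we stated
above (see Theorem 2.1).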



Remark 3. It is sufficient that $A$ satisfies

$$\delta_{(a+1)k} < \frac{a - \left(\omega + (1 - \omega)\sqrt{1 + \rho - 2\alpha\rho}\right)^2}{a + \left(\omega + (1 - \omega)\sqrt{1 + \rho - 2\alpha\rho}\right)^2} \tag{13}$$

for Theorem 2.2 to hold, i.e., to guarantee stable and robust recovery of the signal $x$ from measurements
$y = Ax + e$.



It is easy to see that the sufficient conditions of Theorem 2.2, given in (10) or (13), are weaker than their
counterparts for the standard $\ell_1$ recovery, as given in (5) or (8) respectively, if and only if $\alpha > 0.5$. A similar
statement holds for the constants. In words, if the support estimate is more than 50% accurate, weighted $\ell_1$ is
more favorable than $\ell_1$, at least in terms of sufficient conditions and error bounds.
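The role of the 50% threshold can be checked numerically by evaluating the right-hand side of (13) and comparing it with the right-hand side of (8), which is $(a-1)/(a+1)$. The following sketch does this for a few hypothetical parameter values; the function name and the chosen numbers are assumptions for illustration.

```python
import math


def weighted_rip_bound(a, omega, rho, alpha):
    """Right-hand side of (13): (a - g^2) / (a + g^2), where
    g = omega + (1 - omega) * sqrt(1 + rho - 2 * alpha * rho)."""
    g = omega + (1.0 - omega) * math.sqrt(1.0 + rho - 2.0 * alpha * rho)
    return (a - g * g) / (a + g * g)


a = 3.0
standard = (a - 1.0) / (a + 1.0)          # right-hand side of (8)
for alpha in (0.3, 0.5, 0.7):
    bound = weighted_rip_bound(a, omega=0.5, rho=1.0, alpha=alpha)
    print(alpha, round(bound, 4), "weaker" if bound > standard else "not weaker")
```

A larger bound means a weaker requirement on $\delta_{(a+1)k}$. With $\omega = 1$ the expression reduces to the bound in (8) regardless of $\alpha$.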

The theoretical results presented above suggest that the weight $\omega$ should be set equal to zero when $\alpha > 0.5$
and to one when $\alpha < 0.5$, as these values of $\omega$ give the best sufficient conditions and error bound constants.
However, we conducted extensive numerical simulations in [15] which suggest that a choice of $\omega \approx 0.5$ results in
the best recovery when there is little confidence in the support estimate accuracy. A heuristic explanation of
this observation is given in [15].



3. WEIGHTED $\ell_1$ MINIMIZATION WITH MULTIPLE SUPPORT ESTIMATES

The result in the previous section relies on the availability of a support estimate set $\widetilde{T}$ on which to apply the
weight $\omega$. In this section, we first show that it is possible to draw support estimates from the solution of (2). We
then present the main theorem for stable and robust recovery of an arbitrary vector $x \in \mathbb{R}^N$ from measurements
$y = Ax + e$, $y \in \mathbb{R}^n$ and $n \ll N$, with multiple support estimates having different accuracies.

3.1 Partial support recovery from $\ell_1$ minimization

For signals $x$ that belong to certain signal classes, the solution to the $\ell_1$ minimization problem can carry significant
information on the support $T_0$ of the best $k$-term approximation $x_k$ of $x$. We start by recalling the null space
property (NSP) of a matrix $A$ as defined in [17]. Necessary conditions as well as sufficient conditions for the
existence of some algorithm that recovers $x$ from measurements $y = Ax$ with an error related to the best $k$-term
approximation of $x$ can be formulated in terms of an appropriate NSP. We state below a particular form of the
NSP pertaining to the $\ell_1$-$\ell_1$ instance optimality.

Definition 2. A matrix $A \in \mathbb{R}^{n \times N}$, $n < N$, is said to have the null space property of order $k$ and constant $c_0$
if for any vector $h \in \mathcal{N}(A)$, i.e., $Ah = 0$, and for every index set $T \subseteq \{1, \dots, N\}$ of cardinality $|T| = k$,

$$\|h\|_1 \le c_0\|h_{T^c}\|_1.$$



Among the various important conclusions of [17], the following (in a slightly more general form) will be
instrumental for our results.

Lemma 3.1 ([17]). If $A$ has the restricted isometry property with $\delta_{(a+1)k} < \frac{a-1}{a+1}$ for some $a > 1$, then it has
the NSP of order $k$ and constant $c_0$ given explicitly by

$$c_0 = 1 + \frac{\sqrt{1 + \delta_{ak}}}{\sqrt{a}\sqrt{1 - \delta_{(a+1)k}}}.$$

In what follows, let $x^*$ be the solution to (2) and define the sets $S = \mathrm{supp}(x_s)$, $T_0 = \mathrm{supp}(x_k)$, and $\widetilde{T} = \mathrm{supp}(x^*_k)$
for some integers $k \ge s > 0$.

Proposition 3.2. Suppose that $A$ has the null space property (NSP) of order $k$ with constant $c_0$ and

$$\min_{j \in S}|x(j)| \ge (\eta + 1)\|x_{T_0^c}\|_1, \tag{14}$$

where $\eta = \frac{2c_0}{2 - c_0}$. Then $S \subseteq \widetilde{T}$.

The proof is presented in Appendix A.

Remark 4. Note that if $A$ has RIP so that $\delta_{(a+1)k} < \frac{a-1}{a+1}$ for some $a > 1$, then $\eta$ is given explicitly by

$$\eta = \frac{2\left(\sqrt{a}\sqrt{1 - \delta_{(a+1)k}} + \sqrt{1 + \delta_{ak}}\right)}{\sqrt{a}\sqrt{1 - \delta_{(a+1)k}} - \sqrt{1 + \delta_{ak}}}. \tag{15}$$



Proposition 3.2 states that if $x$ belongs to the class of signals that satisfy (14), then the support $S$ of $x_s$, i.e.,
the set of indices of the $s$ largest-in-magnitude coefficients of $x$, is guaranteed to be contained in the set of
indices of the $k$ largest-in-magnitude coefficients of $x^*$. Consequently, if we consider $\widetilde{T}$ to be a support estimate
for $x_k$, then it has an accuracy $\alpha \ge \frac{s}{k}$.
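Condition (14) can be tested directly on a given signal once an NSP constant is assumed. The sketch below computes $\eta = 2c_0/(2 - c_0)$ and checks whether the $s$-th largest magnitude of $x$ dominates $(\eta + 1)$ times the $\ell_1$ norm of the tail beyond the $k$ largest entries. The function names, the value $c_0 = 1.5$ and the test signal are hypothetical.

```python
def eta_from_nsp(c0):
    """eta = 2 * c0 / (2 - c0); requires 1 <= c0 < 2 so that eta is finite."""
    if not 1.0 <= c0 < 2.0:
        raise ValueError("the NSP constant must satisfy 1 <= c0 < 2")
    return 2.0 * c0 / (2.0 - c0)


def satisfies_decay_condition(x, k, s, c0):
    """Check (14): the s-th largest |x(j)| is at least (eta + 1) times the
    l1 norm of x outside the support of its best k-term approximation."""
    mags = sorted((abs(v) for v in x), reverse=True)
    tail = sum(mags[k:])
    return mags[s - 1] >= (eta_from_nsp(c0) + 1.0) * tail


x = [10.0, 8.0, 1.0, 0.5, 0.1, 0.05]     # hypothetical compressible signal
print(eta_from_nsp(1.5))
print(satisfies_decay_condition(x, k=4, s=2, c0=1.5))
print(satisfies_decay_condition(x, k=4, s=4, c0=1.5))
```

When the check succeeds for some $s$, Proposition 3.2 guarantees that the $k$ largest entries of the $\ell_1$ solution locate the $s$ largest entries of $x$.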



Note here that Proposition 3.2 specifies a class of signals, defined via (14), for which partial support information
can be obtained by using the standard $\ell_1$ recovery method. Though this class is quite restrictive and does
not include various signals of practical interest, experiments suggest that highly accurate support estimates can
still be obtained via $\ell_1$ minimization for signals that only satisfy significantly milder decay conditions than (14).
A theoretical investigation of this observation is an open problem.

3.2 Multiple support estimates with varying accuracy: an idealized motivating example 

Suppose that the entries of $x$ decay according to a power law such that $|x(j)| = cj^{-p}$ for some scaling constant
$c$, $p > 1$ and $j \in \{1, \dots, N\}$. Consider the two support sets $T_1 = \mathrm{supp}(x_{k_1})$ and $T_2 = \mathrm{supp}(x_{k_2})$ for $k_1 > k_2$,
$T_2 \subset T_1$. Suppose also that we can find entries $|x(s_1)| = cs_1^{-p} \approx c(\eta + 1)\frac{k_1^{1-p}}{p-1}$ and $|x(s_2)| = cs_2^{-p} \approx c(\eta + 1)\frac{k_2^{1-p}}{p-1}$
that satisfy (14) for the sets $T_1$ and $T_2$, respectively, where $s_1 \le k_1$ and $s_2 \le k_2$. Then

$$s_1 - s_2 = \left(\frac{p-1}{\eta+1}\right)^{1/p}\left(k_1^{1-1/p} - k_2^{1-1/p}\right) \le \left(\frac{p-1}{\eta+1}\right)^{1/p}(k_1 - k_2),$$

which follows because $0 < 1 - 1/p < 1$ and $k_1 - k_2 \ge 1$.

Consequently, if we define the support estimate sets $\widetilde{T}_1 = \mathrm{supp}(x^*_{k_1})$ and $\widetilde{T}_2 = \mathrm{supp}(x^*_{k_2})$, clearly the corresponding
accuracies $\alpha_1 = \frac{s_1}{k_1}$ and $\alpha_2 = \frac{s_2}{k_2}$ are not necessarily equal. Moreover, if

$$\left(\frac{p-1}{\eta+1}\right)^{1/p} < \alpha_1, \tag{16}$$

then $s_1 - s_2 < \alpha_1(k_1 - k_2)$, and thus $\alpha_1 < \alpha_2$. For example, if we have $p = 1.3$ and $\eta = 5$, we get $\left(\frac{p-1}{\eta+1}\right)^{1/p} \approx 0.1$.
Therefore, in this particular case, if $\alpha_1 > 0.1$, choosing some $k_2 < k_1$ results in $\alpha_2 > \alpha_1$, i.e., we identify two
different support estimates with different accuracies. This observation raises the question, "How should we deal
with the recovery of signals from CS measurements when multiple support estimates with different accuracies
are available?" We propose an answer to this question in the next section.
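The numerical value quoted in this example is easy to reproduce. The sketch below evaluates the threshold $\left(\frac{p-1}{\eta+1}\right)^{1/p}$ from (16) and, under the idealized real-valued approximation $s \approx \left(\frac{p-1}{\eta+1}\right)^{1/p} k^{1-1/p}$ implied by the power-law model, the resulting accuracy $s/k$ for two values of $k$. The function names and the values $k_1 = 100$, $k_2 = 20$ are assumptions for illustration.

```python
def accuracy_threshold(p, eta):
    """Left-hand side of (16): ((p - 1) / (eta + 1)) ** (1 / p)."""
    return ((p - 1.0) / (eta + 1.0)) ** (1.0 / p)


def idealized_accuracy(k, p, eta):
    """Approximate s / k with s = threshold * k ** (1 - 1 / p), treating s as a
    real number (an idealization; in practice s is an integer)."""
    return accuracy_threshold(p, eta) * k ** (-1.0 / p)


print(round(accuracy_threshold(1.3, 5.0), 4))
print(idealized_accuracy(100, 1.3, 5.0), idealized_accuracy(20, 1.3, 5.0))
```

Under this idealization the ratio $s/k$ decreases as $k$ grows, which matches the observation that the smaller set comes with the higher guaranteed accuracy.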

3.3 Stability and robustness conditions 

In this section we present our main theorem for stable and robust recovery of an arbitrary vector $x \in \mathbb{R}^N$ from
measurements $y = Ax + e$, $y \in \mathbb{R}^n$ and $n \ll N$, with multiple support estimates having different accuracies.
Figure 2 illustrates an example of the particular case when only two disjoint support estimate sets are available.

Figure 2. Example of a sparse vector $x$ with support set $T_0$ and two support estimate sets $\widetilde{T}_1$ and $\widetilde{T}_2$. The weight vector
is chosen so that weights $\omega_1$ and $\omega_2$ are applied to the sets $\widetilde{T}_1$ and $\widetilde{T}_2$, respectively, and a weight equal to one elsewhere.

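Let $T_0$ be the support of the best $k$-term approximation $x_k$ of the signal $x$. Suppose that we have a support
estimate $\widetilde{T}$ that can be written as the union of $m$ disjoint subsets $\widetilde{T}_j$, $j \in \{1, \dots, m\}$, each of which has cardinality
$|\widetilde{T}_j| = \rho_j k$, $0 \le \rho_j \le a$ for some $a > 1$, and accuracy $\alpha_j = \frac{|\widetilde{T}_j \cap T_0|}{|\widetilde{T}_j|}$.

Again, we wish to recover $x$ from measurements $y = Ax + e$ with $\|e\|_2 \le \epsilon$. To do this, we consider the
general weighted $\ell_1$ minimization problem

$$\min_{u \in \mathbb{R}^N} \|u\|_{1,w} \quad \text{subject to} \quad \|Au - y\|_2 \le \epsilon, \tag{17}$$

where $\|u\|_{1,w} = \sum_{i=1}^{N} w_i|u_i|$ and

$$w_i = \begin{cases} \omega_1, & i \in \widetilde{T}_1, \\ \ \vdots \\ \omega_m, & i \in \widetilde{T}_m, \\ 1, & i \in \widetilde{T}^c, \end{cases}$$

for $0 \le \omega_j \le 1$, for all $j \in \{1, \dots, m\}$, and $\widetilde{T} = \bigcup_{j=1}^{m} \widetilde{T}_j$.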

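Theorem 3.3. Let $x \in \mathbb{R}^N$ and $y = Ax + e$, where $A$ is an $n \times N$ matrix and $e$ is additive noise with
$\|e\|_2 \le \epsilon$ for some known $\epsilon \ge 0$. Denote by $x_k$ the best $k$-term approximation of $x$, supported on $T_0$, and let
$\widetilde{T}_1, \dots, \widetilde{T}_m \subseteq \{1, \dots, N\}$ be as defined above with cardinality $|\widetilde{T}_j| = \rho_j k$ and accuracy $\alpha_j = \frac{|\widetilde{T}_j \cap T_0|}{|\widetilde{T}_j|}$, $j \in \{1, \dots, m\}$.
For some given $0 \le \omega_1, \dots, \omega_m \le 1$, define

$$\gamma := \sum_{j=1}^{m} \omega_j - (m - 1) + \sum_{j=1}^{m} (1 - \omega_j)\sqrt{1 + \rho_j - 2\alpha_j\rho_j}.$$

If the RIP constants of $A$ are such that there exists an $a \in \frac{1}{k}\mathbb{Z}$, with $a > 1$, and

$$\delta_{ak} + \frac{a}{\gamma^2}\,\delta_{(a+1)k} < \frac{a}{\gamma^2} - 1, \tag{18}$$

then the solution $x^\#$ to (17) obeys

$$\|x^\# - x\|_2 \le C_0(\gamma)\,\epsilon + C_1(\gamma)k^{-1/2}\left(\sum_{j=1}^{m} \omega_j\|x_{\widetilde{T}_j \cap T_0^c}\|_1 + \|x_{\widetilde{T}^c \cap T_0^c}\|_1\right). \tag{19}$$

The proof is presented in Appendix B.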
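The quantities that enter Theorem 3.3 can be assembled as in the following sketch, which builds the weight vector of (17) from a list of disjoint subsets, evaluates $\gamma$, and tests condition (18) for assumed RIP constants. All function names and numerical values are hypothetical; condition (18) is multiplied through by $\gamma^2$ so that the check remains well defined when $\gamma = 0$.

```python
import math


def multi_set_weights(N, subsets, omegas):
    """Weight vector of (17): omega_j on the j-th subset, 1 elsewhere.
    The subsets are assumed to be disjoint, with zero-based indices."""
    w = [1.0] * N
    for T_j, omega_j in zip(subsets, omegas):
        for i in T_j:
            w[i] = omega_j
    return w


def gamma(omegas, rhos, alphas):
    """gamma as defined in Theorem 3.3."""
    m = len(omegas)
    return (sum(omegas) - (m - 1)
            + sum((1.0 - w) * math.sqrt(1.0 + r - 2.0 * al * r)
                  for w, r, al in zip(omegas, rhos, alphas)))


def rip_condition_holds(a, delta_ak, delta_a1k, g):
    """Condition (18), rewritten as g^2 * (1 + delta_ak) < a * (1 - delta_a1k)."""
    return g * g * (1.0 + delta_ak) < a * (1.0 - delta_a1k)


g = gamma([0.1, 0.5], [0.5, 0.5], [0.8, 0.3])
print(g, rip_condition_holds(3.0, 0.4, 0.4, g))
print(multi_set_weights(5, [{0}, {2, 3}], [0.1, 0.5]))
```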

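Remark 5. The constants $C_0(\gamma)$ and $C_1(\gamma)$ are well-behaved and given explicitly by the expressions

$$C_0(\gamma) = \frac{2\left(1 + \frac{\gamma}{\sqrt{a}}\right)}{\sqrt{1 - \delta_{(a+1)k}} - \frac{\gamma}{\sqrt{a}}\sqrt{1 + \delta_{ak}}}, \qquad C_1(\gamma) = \frac{2a^{-1/2}\left(\sqrt{1 - \delta_{(a+1)k}} + \sqrt{1 + \delta_{ak}}\right)}{\sqrt{1 - \delta_{(a+1)k}} - \frac{\gamma}{\sqrt{a}}\sqrt{1 + \delta_{ak}}}. \tag{20}$$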



Remark 6. Theorem 3.3 is a generalization of Theorem 2.2 for $m \ge 1$ support estimates. It is easy to see that
when the number of support estimates $m = 1$, Theorem 3.3 reduces to the recovery conditions of Theorem 2.2.
Moreover, setting $\omega_j = 1$ for all $j \in \{1, \dots, m\}$ reduces the result to that in Theorem 2.1.

Remark 7. The sufficient recovery condition (13) becomes, in the case of multiple support estimates,

$$\delta_{(a+1)k} < \frac{a - \gamma^2}{a + \gamma^2}, \tag{21}$$

where $\gamma$ is as defined in Theorem 3.3. It can be shown that when $m = 1$, condition (21) reduces to the expression in (13).



Remark 8. The value of $\gamma$ controls the recovery guarantees of the multiple-set weighted $\ell_1$ minimization problem.
For instance, as $\gamma$ approaches 0, condition (21) becomes weaker and the error bound constants $C_0(\gamma)$ and $C_1(\gamma)$
become smaller. Therefore, given a set of support estimate accuracies $\alpha_j$ for all $j \in \{1, \dots, m\}$, it is useful to find
the corresponding weights $\omega_j$ that minimize $\gamma$. Notice that for all $j$, $\gamma$ is a sum of linear functions of $\omega_j$ with
$\alpha_j$ controlling the slope. When $\alpha_j > 0.5$, the slope is positive and the optimal value of $\omega_j$ is 0. Otherwise, when
$\alpha_j < 0.5$, the slope is negative and the optimal value of $\omega_j$ is 1. Hence, as in the single support estimate case, the
theoretical conditions indicate that when the $\alpha_j$ are known, a choice of $\omega_j$ equal to zero or one should be optimal.
However, when the knowledge of $\alpha_j$ is not reliable, experimental results indicate that intermediate values of $\omega_j$
produce the best recovery results.

Figure 3. Comparison between the recovered SNR (averaged over 100 experiments) using two-set weighted $\ell_1$ with support
estimate $\widetilde{T}$ and accuracy $\alpha$, three-set weighted $\ell_1$ minimization with support estimates $\widetilde{T}_1 \cup \widetilde{T}_2 = \widetilde{T}$ and accuracy $\alpha_1 = 0.8$
and $\alpha_2 < \alpha$, and non-weighted $\ell_1$ minimization.
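The weight choice discussed in Remark 8 can be sketched as follows: since $\gamma$ is linear in each $\omega_j$ with slope $1 - \sqrt{1 + \rho_j - 2\alpha_j\rho_j}$, the minimizer over $[0,1]^m$ sits at a corner. The code picks that corner and confirms it against a coarse grid search. The function names and parameter values are illustrative assumptions, and this only minimizes the theoretical quantity $\gamma$; as the remark notes, intermediate weights may work better in practice.

```python
import math
from itertools import product


def gamma(omegas, rhos, alphas):
    """gamma as defined in Theorem 3.3."""
    m = len(omegas)
    return (sum(omegas) - (m - 1)
            + sum((1.0 - w) * math.sqrt(1.0 + r - 2.0 * al * r)
                  for w, r, al in zip(omegas, rhos, alphas)))


def bound_optimal_weights(alphas):
    """omega_j = 0 when alpha_j > 0.5, else 1 (at alpha_j = 0.5 the slope is
    zero, so either choice gives the same gamma)."""
    return [0.0 if al > 0.5 else 1.0 for al in alphas]


rhos, alphas = [0.5, 0.5], [0.8, 0.3]     # hypothetical sizes and accuracies
best = bound_optimal_weights(alphas)
grid = [i / 4.0 for i in range(5)]
grid_min = min(gamma(list(ws), rhos, alphas) for ws in product(grid, repeat=2))
print(best, gamma(best, rhos, alphas), grid_min)
```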



4. NUMERICAL EXPERIMENTS 

In what follows, we consider the particular case of $m = 2$, i.e., where there exists prior information on two
disjoint support estimates $\widetilde{T}_1$ and $\widetilde{T}_2$ with respective accuracies $\alpha_1$ and $\alpha_2$. We present numerical experiments
that illustrate the benefits of using three-set weighted $\ell_1$ minimization over two-set weighted $\ell_1$ and non-weighted
$\ell_1$ minimization when additional prior support information is available.

To that end, we compare the recovery capabilities of these algorithms for a suite of synthetically generated
sparse signals. We also present the recovery results for a practical application of recovering audio signals using
the proposed weighting. In all of our experiments, we use SPGL1 to solve the standard and weighted $\ell_1$
minimization problems.

4.1 Recovery of synthetic signals 

We generate signals $x$ with an ambient dimension $N = 500$ and fixed sparsity $k = 40$. We compute the (noisy)
compressed measurements of $x$ using a Gaussian random measurement matrix $A$ with dimensions $n \times N$ where
$n = 100$. To quantify the reconstruction quality, we use the reconstruction signal-to-noise ratio (SNR) averaged
over 100 realizations of the same experimental conditions. The SNR is measured in dB and is given by

$$\mathrm{SNR}(x, \tilde{x}) = 10\log_{10}\left(\frac{\|x\|_2^2}{\|x - \tilde{x}\|_2^2}\right), \tag{22}$$

where $x$ is the true signal and $\tilde{x}$ is the recovered signal.
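For reference, the SNR of (22) can be computed with a few lines of standard-library Python. The function name `snr_db` is an assumption, and returning infinity for a perfect reconstruction is a convention chosen for this sketch rather than something stated in the text.

```python
import math


def snr_db(x, x_rec):
    """Reconstruction SNR in dB, as in (22)."""
    signal = sum(v * v for v in x)
    error = sum((a - b) ** 2 for a, b in zip(x, x_rec))
    if error == 0.0:
        return math.inf  # exact recovery (convention assumed here)
    return 10.0 * math.log10(signal / error)


print(snr_db([3.0, 4.0], [3.0, 3.5]))
```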

The recovery via two-set weighted $\ell_1$ minimization uses a support estimate $\widetilde{T}$ of size $|\widetilde{T}| = 40$ (i.e., $\rho = 1$)
where the accuracy $\alpha$ of the support estimate takes on the values $\{0.3, 0.5, 0.7\}$, and the weight $\omega$ is chosen from
$\{0.1, 0.3, 0.5\}$.

Recovery via three-set weighted $\ell_1$ minimization assumes the existence of two support estimates $\widetilde{T}_1$ and $\widetilde{T}_2$,
which are disjoint subsets of $\widetilde{T}$ described above. The set $\widetilde{T}_1$ is chosen such that it always has an accuracy $\alpha_1 = 0.8$
while $\widetilde{T}_2 = \widetilde{T} \setminus \widetilde{T}_1$. In all experiments, we fix $\omega_1 = 0.01$ and set $\omega_2 = \omega$.
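One way to generate support estimates with a prescribed accuracy for such an experiment is sketched below. This is an assumption about the setup, not the authors' procedure: the text does not say how the sets are drawn or how large the first subset is, so the random sampling, the helper names, and the size of 10 for the first subset are all hypothetical.

```python
import random


def make_support_estimate(N, T0, size, alpha, rng):
    """Draw `size` indices of which round(alpha * size) lie in T0."""
    inside = sorted(T0)
    in_T0 = set(T0)
    outside = [i for i in range(N) if i not in in_T0]
    n_correct = round(alpha * size)
    return (set(rng.sample(inside, n_correct))
            | set(rng.sample(outside, size - n_correct)))


def split_estimate(T_est, T0, size1, alpha1, rng):
    """Split T_est into T1 (size1 indices with accuracy alpha1) and
    T2 = T_est minus T1."""
    correct = sorted(set(T_est) & set(T0))
    wrong = sorted(set(T_est) - set(T0))
    n_correct = round(alpha1 * size1)
    T1 = (set(rng.sample(correct, n_correct))
          | set(rng.sample(wrong, size1 - n_correct)))
    return T1, set(T_est) - T1


rng = random.Random(0)
T0 = set(range(40))                                   # hypothetical true support
T_est = make_support_estimate(500, T0, 40, 0.5, rng)  # alpha = 0.5, rho = 1
T1, T2 = split_estimate(T_est, T0, 10, 0.8, rng)      # alpha_1 = 0.8
print(len(T_est & T0), len(T1 & T0), len(T2 & T0))
```

In this example the remaining set has accuracy $12/30 = 0.4$, which is below the overall accuracy of 0.5, in line with the relation $\alpha_2 < \alpha$ stated in the caption of Figure 3.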



Figure 3 illustrates the recovery performance of three-set weighted $\ell_1$ minimization compared to two-set
weighted $\ell_1$ using the setup described above and non-weighted $\ell_1$ minimization. The figure shows that
utilizing the extra accuracy of $\widetilde{T}_1$ by setting a smaller weight $\omega_1$ results in better signal recovery from the same
measurements.



4.2 Recovery of audio signals 

Next, we examine the performance of three-set weighted $\ell_1$ minimization for the recovery of compressed sensing
measurements of speech signals. In particular, the original signals are sampled at 44.1 kHz, but only 1/4th of
the samples are retained (with their indices chosen randomly from the uniform distribution). This yields the
measurements $y = Rs$, where $s$ is the speech signal and $R$ is a restriction (of the identity) operator. Consequently,
by dividing the measurements into blocks of size $N$, we can write $y = [y_1^T, y_2^T, \dots]^T$. Here each $y_j = R_j s_j$ is the
measurement vector corresponding to the $j$th block of the signal, and $R_j \in \mathbb{R}^{n_j \times N}$ is the associated restriction
matrix. The signals we use in our experiments consist of 21 such blocks.

We make the following assumptions about speech signals: 

1. The signal blocks are compressible in the DCT domain (for example, the MP3 compression standard uses 
a version of the DCT to compress audio signals.) 

2. The support set corresponding to the largest coefficients in adjacent blocks does not change much from 
block to block. 

3. Speech signals have large low-frequency coefficients. 

Thus, for the reconstruction of the $j$th block, we define the support estimates as follows: $\widetilde{T}_1$ is the set corresponding
to the largest $n_j/16$ recovered coefficients of the previous block (for the first block $\widetilde{T}_1$ is empty) and $\widetilde{T}_2$ is the
set corresponding to frequencies up to 4 kHz. For recovery using two-set weighted $\ell_1$ minimization, we define
$\widetilde{T} = \widetilde{T}_1 \cup \widetilde{T}_2$ and assign it a weight of $\omega$. In the three-set weighted $\ell_1$ case, we assign weights $\omega_1 = \omega/2$ on the
set $\widetilde{T}_1$ and $\omega_2 = \omega$ on the set $\widetilde{T} \setminus \widetilde{T}_1$. The results of experiments on an example speech signal with $N = 2048$,
and $\omega \in \{0, 1/6, 2/6, \dots, 1\}$ are illustrated in Figure 4. It is clear from the figure that three-set weighted $\ell_1$
minimization has better recovery performance over all tested values of $\omega$ spanning the interval $[0, 1]$.
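The per-block construction of the two support estimates and the corresponding weights can be sketched as follows. Several details are assumptions made for this example: the helper names, the use of integer division for $n_j/16$, and in particular the mapping of DCT index $i$ to the frequency $i f_s / (2N)$, which holds for a length-$N$ DCT-II but is not stated in the text.

```python
def audio_support_estimates(prev_coeffs, n_j, N, fs_hz, cutoff_hz=4000.0):
    """Return (T1, T2) for one block.

    T1: indices of the n_j // 16 largest-magnitude coefficients recovered for
        the previous block (empty when prev_coeffs is None, i.e. first block).
    T2: DCT indices whose frequency i * fs_hz / (2 * N) is at most cutoff_hz
        (assumed DCT-II index-to-frequency mapping).
    """
    if prev_coeffs is None:
        T1 = set()
    else:
        order = sorted(range(N), key=lambda i: -abs(prev_coeffs[i]))
        T1 = set(order[: n_j // 16])
    T2 = {i for i in range(N) if i * fs_hz / (2.0 * N) <= cutoff_hz}
    return T1, T2


def block_weights(N, T1, T2, omega, three_set=True):
    """omega on T1 ∪ T2; in the three-set case omega / 2 on T1 instead."""
    w = [1.0] * N
    for i in T1 | T2:
        w[i] = omega
    if three_set:
        for i in T1:
            w[i] = omega / 2.0
    return w


prev = [float(i) for i in range(2048)]    # hypothetical previous-block coefficients
T1, T2 = audio_support_estimates(prev, n_j=512, N=2048, fs_hz=44100.0)
w = block_weights(2048, T1, T2, omega=0.5)
print(len(T1), len(T2), w[0], w[1000], w[2047])
```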







Figure 4. SNRs of the two reconstruction algorithms, two-set and three-set weighted $\ell_1$ minimization, for a speech signal
from compressed sensing measurements plotted against $\omega$.



5. CONCLUSION 

In conclusion, we derived stability and robustness guarantees for the weighted $\ell_1$ minimization problem with
multiple support estimates with varying accuracy. We showed that incorporating additional support information
by applying a smaller weight to the estimated subsets of the support with higher accuracy improves the recovery
conditions compared with the case of a single support estimate and the case of (non-weighted) $\ell_1$ minimization.
We also showed that for a certain class of signals, the coefficients of which decay in a particular way, it is
possible to draw a support estimate from the solution of the $\ell_1$ minimization problem. These results raise
the question of whether it is possible to improve on the support estimate by solving a subsequent weighted $\ell_1$
minimization problem. Moreover, they raise an interest in defining a new iterative weighted $\ell_1$ algorithm which
depends on the support accuracy instead of the coefficient magnitude, as is the case of the Candès, Wakin, and
Boyd [20] (IRL1) algorithm. We shall consider these problems elsewhere.

APPENDIX A. PROOF OF PROPOSITION 3.2 

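We want to find the conditions on the signal $x$ and the matrix $A$ which guarantee that the solution $x^*$ to the $\ell_1$
minimization problem (2) has the following property:

$$\min_{j \in S}|x^*(j)| \ge \max_{j \in \widetilde{T}^c}|x^*(j)| = |x^*(k + 1)|,$$

where $|x^*(k+1)|$ denotes the $(k+1)$-th largest magnitude among the entries of $x^*$.

Suppose that the matrix $A$ has the null space property (NSP) [17] of order $k$, i.e., for any $h \in \mathcal{N}(A)$, $Ah = 0$,
we have

$$\|h\|_1 \le c_0\|h_{T_0^c}\|_1,$$

where $T_0 \subseteq \{1, 2, \dots, N\}$ with $|T_0| = k$, and $\mathcal{N}(A)$ denotes the null space of $A$.

If $A$ has RIP with $\delta_{(a+1)k} < \frac{a-1}{a+1}$ for some constant $a > 1$, then it has the NSP of order $k$ with constant $c_0$,
which can be written explicitly in terms of the RIP constants of $A$ as follows:

$$c_0 = 1 + \frac{\sqrt{1 + \delta_{ak}}}{\sqrt{a}\sqrt{1 - \delta_{(a+1)k}}}.$$

Define $h = x^* - x$; then $h \in \mathcal{N}(A)$ and we can write the $\ell_1$-$\ell_1$ instance optimality as follows:

$$\|h\|_1 \le \frac{2c_0}{2 - c_0}\|x_{T_0^c}\|_1,$$

with $c_0 < 2$. Let $\eta = \frac{2c_0}{2 - c_0}$. Since $\|h_{T_0^c}\|_1 \ge \|x^*_{T_0^c}\|_1 - \|x_{T_0^c}\|_1$, the bound on $\|h_{T_0}\|_1$ is then given by

$$\|h_{T_0}\|_1 \le (\eta + 1)\|x_{T_0^c}\|_1 - \|x^*_{T_0^c}\|_1. \tag{23}$$

The next step is to bound $\|x^*_{T_0^c}\|_1$. Noting that $\widetilde{T} = \mathrm{supp}(x^*_k)$, we have $\|x^*_{\widetilde{T}^c}\|_1 \le \|x^*_{T_0^c}\|_1$, and hence

$$\|x^*_{T_0^c}\|_1 \ge \|x^*_{\widetilde{T}^c}\|_1 \ge |x^*(k + 1)|.$$

Using the reverse triangle inequality, we have $|x(j) - x^*(j)| \ge |x(j)| - |x^*(j)|$ for all $j$, which leads to

$$\min_{j \in S}|x^*(j)| \ge \min_{j \in S}|x(j)| - \max_{j \in S}|x(j) - x^*(j)|.$$

But $\max_{j \in S}|x(j) - x^*(j)| = \|h_S\|_\infty \le \|h_S\|_1 \le \|h_{T_0}\|_1$, so combining the above three equations we get

$$\min_{j \in S}|x^*(j)| \ge |x^*(k + 1)| + \min_{j \in S}|x(j)| - (\eta + 1)\|x_{T_0^c}\|_1. \tag{24}$$

Equation (24) says that if the matrix $A$ has $\delta_{(a+1)k}$-RIP and the signal $x$ obeys

$$\min_{j \in S}|x(j)| \ge (\eta + 1)\|x_{T_0^c}\|_1,$$

then the support $\widetilde{T}$ of the largest $k$ entries of the solution $x^*$ to (2) contains the support $S$ of the largest $s$ entries
of the signal $x$.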



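APPENDIX B. PROOF OF THEOREM 3.3

The proof of Theorem 3.3 follows along the same lines as our previous work in [15] with some modifications. Recall
that the sets $\widetilde{T}_j$ are disjoint and $\widetilde{T} = \bigcup_{j=1}^{m} \widetilde{T}_j$, and define the sets $\widetilde{T}_{j\alpha} = T_0 \cap \widetilde{T}_j$, for all $j \in \{1, \dots, m\}$, where
$|\widetilde{T}_{j\alpha}| = \alpha_j\rho_j k$.

Let $x^\# = x + h$ be the minimizer of the weighted $\ell_1$ problem (17). Then

$$\|x + h\|_{1,w} \le \|x\|_{1,w}.$$

Moreover, by the choice of weights in (17), we have

$$\omega_1\|x_{\widetilde{T}_1} + h_{\widetilde{T}_1}\|_1 + \dots + \omega_m\|x_{\widetilde{T}_m} + h_{\widetilde{T}_m}\|_1 + \|x_{\widetilde{T}^c} + h_{\widetilde{T}^c}\|_1 \le \omega_1\|x_{\widetilde{T}_1}\|_1 + \dots + \omega_m\|x_{\widetilde{T}_m}\|_1 + \|x_{\widetilde{T}^c}\|_1.$$

Consequently,

$$\|x_{\widetilde{T}^c \cap T_0} + h_{\widetilde{T}^c \cap T_0}\|_1 + \|x_{\widetilde{T}^c \cap T_0^c} + h_{\widetilde{T}^c \cap T_0^c}\|_1 + \sum_{j=1}^{m} \omega_j\left(\|x_{\widetilde{T}_j \cap T_0} + h_{\widetilde{T}_j \cap T_0}\|_1 + \|x_{\widetilde{T}_j \cap T_0^c} + h_{\widetilde{T}_j \cap T_0^c}\|_1\right)$$
$$\le \|x_{\widetilde{T}^c \cap T_0}\|_1 + \|x_{\widetilde{T}^c \cap T_0^c}\|_1 + \sum_{j=1}^{m}\left(\omega_j\|x_{\widetilde{T}_j \cap T_0}\|_1 + \omega_j\|x_{\widetilde{T}_j \cap T_0^c}\|_1\right).$$

Next we use the forward and reverse triangle inequalities to get

$$\sum_{j=1}^{m} \omega_j\|h_{\widetilde{T}_j \cap T_0^c}\|_1 + \|h_{\widetilde{T}^c \cap T_0^c}\|_1 \le \|h_{\widetilde{T}^c \cap T_0}\|_1 + \sum_{j=1}^{m} \omega_j\|h_{\widetilde{T}_j \cap T_0}\|_1 + 2\left(\|x_{\widetilde{T}^c \cap T_0^c}\|_1 + \sum_{j=1}^{m} \omega_j\|x_{\widetilde{T}_j \cap T_0^c}\|_1\right).$$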

m 

Adding E (1 ~ £dj)||ftf ?nTO ||i on both sides of the inequality above 

3=1 3 

m 

E ll^nrclli + Hft' 
3=1 



we obtain 



m m 

Tcnrelli - S w il|ftftnr lli + EC 1 - ^Ol^ 
j— i 3=1 



'j)||/if 3 n T c||i + l|ftfc n T lli 



Since ||/i T(f ||i = ||ftf nT c||i + l|ftfc nT c||i and ||/i, 



■ to 



m m 

||ftr Hli < Y. UJ ^ h f 3 nT h -^OII^nT^lli + 11% 

7 = 1 7=1 



m \ 

||xf cnT c||i + E ^jll^-nT^lli j • 

m 

; fnT c lli = E l|ftf nT 1 !! 1 ' ^ ms easn Y reduces 

o j=1 1 

(m 



(25) 



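Now consider the following term from the right-hand side of (25):

$$\sum_{j=1}^{m} \omega_j\|h_{\widetilde{T}_j \cap T_0}\|_1 + \sum_{j=1}^{m}(1 - \omega_j)\|h_{\widetilde{T}_j \cap T_0^c}\|_1 + \|h_{\widetilde{T}^c \cap T_0}\|_1.$$

Add and subtract $\sum_{j=1}^{m}(1 - \omega_j)\|h_{\widetilde{T}_j^c \cap T_0}\|_1$, and since the set $\widetilde{T}_{j\alpha} = T_0 \cap \widetilde{T}_j$, we can write
$\|h_{\widetilde{T}_j^c \cap T_0}\|_1 + \|h_{\widetilde{T}_j \cap T_0^c}\|_1 = \|h_{T_0 \cup \widetilde{T}_j \setminus \widetilde{T}_{j\alpha}}\|_1$ to get

$$\sum_{j=1}^{m} \omega_j\left(\|h_{\widetilde{T}_j \cap T_0}\|_1 + \|h_{\widetilde{T}_j^c \cap T_0}\|_1\right) + \sum_{j=1}^{m}(1 - \omega_j)\left(\|h_{\widetilde{T}_j \cap T_0^c}\|_1 + \|h_{\widetilde{T}_j^c \cap T_0}\|_1\right) + \|h_{\widetilde{T}^c \cap T_0}\|_1 - \sum_{j=1}^{m}\|h_{\widetilde{T}_j^c \cap T_0}\|_1$$
$$= \left(\sum_{j=1}^{m} \omega_j\right)\|h_{T_0}\|_1 + \|h_{\widetilde{T}^c \cap T_0}\|_1 - \sum_{j=1}^{m}\|h_{\widetilde{T}_j^c \cap T_0}\|_1 + \sum_{j=1}^{m}(1 - \omega_j)\|h_{T_0 \cup \widetilde{T}_j \setminus \widetilde{T}_{j\alpha}}\|_1$$
$$= \left(\sum_{j=1}^{m} \omega_j - m + 1\right)\|h_{T_0}\|_1 + \sum_{j=1}^{m}(1 - \omega_j)\|h_{T_0 \cup \widetilde{T}_j \setminus \widetilde{T}_{j\alpha}}\|_1.$$

The last equality comes from $\|h_{\widetilde{T}_j^c \cap T_0}\|_1 = \|h_{\widetilde{T}^c \cap T_0}\|_1 + \|h_{T_0 \cap (\widetilde{T} \setminus \widetilde{T}_j)}\|_1$ and
$\sum_{j=1}^{m}\|h_{T_0 \cap (\widetilde{T} \setminus \widetilde{T}_j)}\|_1 = (m - 1)\|h_{T_0 \cap \widetilde{T}}\|_1$.

Consequently, we can reduce the bound on $\|h_{T_0^c}\|_1$ to the following expression:

$$\|h_{T_0^c}\|_1 \le \left(\sum_{j=1}^{m} \omega_j - m + 1\right)\|h_{T_0}\|_1 + \sum_{j=1}^{m}(1 - \omega_j)\|h_{T_0 \cup \widetilde{T}_j \setminus \widetilde{T}_{j\alpha}}\|_1 + 2\left(\|x_{\widetilde{T}^c \cap T_0^c}\|_1 + \sum_{j=1}^{m} \omega_j\|x_{\widetilde{T}_j \cap T_0^c}\|_1\right). \tag{26}$$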



Next we follow the technique of Candès et al. and sort the coefficients of $h_{T_0^c}$, partitioning $T_0^c$ into disjoint
sets $T_j$, $j \in \{1, 2, \dots\}$, each of size $ak$, where $a > 1$. That is, $T_1$ indexes the $ak$ largest in magnitude coefficients
of $h_{T_0^c}$, $T_2$ indexes the second $ak$ largest in magnitude coefficients of $h_{T_0^c}$, and so on. Note that this gives
$h_{T_0^c} = \sum_{j \ge 1} h_{T_j}$, with

$$\|h_{T_j}\|_2 \le \sqrt{ak}\,\|h_{T_j}\|_\infty \le (ak)^{-1/2}\|h_{T_{j-1}}\|_1. \tag{27}$$

Let $T_{01} = T_0 \cup T_1$; then using (27) and the triangle inequality we have

$$\|h_{T_{01}^c}\|_2 \le \sum_{j \ge 2}\|h_{T_j}\|_2 \le (ak)^{-1/2}\sum_{j \ge 1}\|h_{T_j}\|_1 \le (ak)^{-1/2}\|h_{T_0^c}\|_1. \tag{28}$$
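The sorting argument behind (27) and (28) can be checked numerically on a small example. The sketch below partitions the complement of $T_0$ into blocks of decreasing magnitude and verifies both inequalities; the function names, the test vector, and the block size (standing in for $ak$) are hypothetical, and indices are zero-based.

```python
import math


def sorted_blocks(h, T0, block_size):
    """Partition the complement of T0 into consecutive blocks of `block_size`
    indices, ordered by decreasing magnitude of h (last block may be shorter)."""
    rest = sorted((i for i in range(len(h)) if i not in T0),
                  key=lambda i: -abs(h[i]))
    return [rest[i:i + block_size] for i in range(0, len(rest), block_size)]


def l1(h, idx):
    return sum(abs(h[i]) for i in idx)


def l2(h, idx):
    return math.sqrt(sum(h[i] ** 2 for i in idx))


h = [5.0, -4.0, 3.0, -2.5, 2.0, 1.5, -1.0, 0.5, 0.25, 0.1]
T0, ak = {0, 1}, 3                       # block size plays the role of a * k
blocks = sorted_blocks(h, T0, ak)

# (27): each block's l2 norm is bounded by the previous block's scaled l1 norm.
ok_27 = all(l2(h, blocks[j]) <= l1(h, blocks[j - 1]) / math.sqrt(ak)
            for j in range(1, len(blocks)))

# (28): l2 norm outside T0 ∪ T1 is bounded by the scaled l1 norm outside T0.
tail = [i for b in blocks[1:] for i in b]
complement = [i for b in blocks for i in b]
ok_28 = l2(h, tail) <= l1(h, complement) / math.sqrt(ak)
print(blocks, ok_27, ok_28)
```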

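Next, consider the feasibility of $x^\#$ and $x$. Both vectors are feasible, so we have $\|Ah\|_2 \le 2\epsilon$ and

$$\|Ah_{T_{01}}\|_2 \le 2\epsilon + \|Ah_{T_{01}^c}\|_2 \le 2\epsilon + \sum_{j \ge 2}\|Ah_{T_j}\|_2 \le 2\epsilon + \sqrt{1 + \delta_{ak}}\sum_{j \ge 2}\|h_{T_j}\|_2.$$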



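From (26) and (28) we get

$$\|Ah_{T_{01}}\|_2 \le 2\epsilon + \frac{2\sqrt{1 + \delta_{ak}}}{\sqrt{ak}}\left(\|x_{\widetilde{T}^c \cap T_0^c}\|_1 + \sum_{j=1}^{m} \omega_j\|x_{\widetilde{T}_j \cap T_0^c}\|_1\right) + \frac{\sqrt{1 + \delta_{ak}}}{\sqrt{ak}}\left(\left(\sum_{j=1}^{m} \omega_j - m + 1\right)\|h_{T_0}\|_1 + \sum_{j=1}^{m}(1 - \omega_j)\|h_{T_0 \cup \widetilde{T}_j \setminus \widetilde{T}_{j\alpha}}\|_1\right).$$

Noting that $|T_0 \cup \widetilde{T}_j \setminus \widetilde{T}_{j\alpha}| = (1 + \rho_j - 2\alpha_j\rho_j)k$,

$$\sqrt{1 - \delta_{(a+1)k}}\,\|h_{T_{01}}\|_2 \le 2\epsilon + \frac{2\sqrt{1 + \delta_{ak}}}{\sqrt{ak}}\left(\|x_{\widetilde{T}^c \cap T_0^c}\|_1 + \sum_{j=1}^{m} \omega_j\|x_{\widetilde{T}_j \cap T_0^c}\|_1\right) + \frac{\sqrt{1 + \delta_{ak}}}{\sqrt{a}}\left(\left(\sum_{j=1}^{m} \omega_j - m + 1\right)\|h_{T_0}\|_2 + \sum_{j=1}^{m}(1 - \omega_j)\sqrt{1 + \rho_j - 2\alpha_j\rho_j}\,\|h_{T_0 \cup \widetilde{T}_j \setminus \widetilde{T}_{j\alpha}}\|_2\right).$$

Since for every $j$ we have $\|h_{T_0 \cup \widetilde{T}_j \setminus \widetilde{T}_{j\alpha}}\|_2 \le \|h_{T_{01}}\|_2$ and $\|h_{T_0}\|_2 \le \|h_{T_{01}}\|_2$, it follows that

$$\|h_{T_{01}}\|_2 \le \frac{2\epsilon + \frac{2\sqrt{1 + \delta_{ak}}}{\sqrt{ak}}\left(\|x_{\widetilde{T}^c \cap T_0^c}\|_1 + \sum_{j=1}^{m} \omega_j\|x_{\widetilde{T}_j \cap T_0^c}\|_1\right)}{\sqrt{1 - \delta_{(a+1)k}} - \frac{\sum_{j=1}^{m} \omega_j - m + 1 + \sum_{j=1}^{m}(1 - \omega_j)\sqrt{1 + \rho_j - 2\alpha_j\rho_j}}{\sqrt{a}}\sqrt{1 + \delta_{ak}}}. \tag{29}$$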



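Finally, using $\|h\|_2 \le \|h_{T_{01}}\|_2 + \|h_{T_{01}^c}\|_2$ and letting $\gamma = \sum_{j=1}^{m} \omega_j - m + 1 + \sum_{j=1}^{m}(1 - \omega_j)\sqrt{1 + \rho_j - 2\alpha_j\rho_j}$, we combine
(26), (28) and (29) to get

$$\|h\|_2 \le \frac{2\left(1 + \frac{\gamma}{\sqrt{a}}\right)\epsilon + 2\,\frac{\sqrt{1 - \delta_{(a+1)k}} + \sqrt{1 + \delta_{ak}}}{\sqrt{ak}}\left(\|x_{\widetilde{T}^c \cap T_0^c}\|_1 + \sum_{j=1}^{m} \omega_j\|x_{\widetilde{T}_j \cap T_0^c}\|_1\right)}{\sqrt{1 - \delta_{(a+1)k}} - \frac{\gamma}{\sqrt{a}}\sqrt{1 + \delta_{ak}}}, \tag{30}$$

with the condition that the denominator is positive, equivalently $\delta_{ak} + \frac{a}{\gamma^2}\delta_{(a+1)k} < \frac{a}{\gamma^2} - 1$.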